Regular expressions are pattern-matching utilities found in most programming languages. A regular expression defines a generic pattern that is matched against a sequence of input characters. Regular expressions are widely used in text parsing and search. In Scala, the Regex class is available in the scala.util.matching package.
import scala.util.matching.Regex
object Demo {
  def main(args: Array[String]): Unit = {
    val p = "Functional".r // build a Regex from the string
    val st = "Scala is a Functional Programming Language"
    println(p.findFirstIn(st)) // prints Some(Functional)
  }
}
In the above example we search for the word “Functional”. We call the r method on the string; the string is implicitly converted to StringOps (RichString in old Scala versions), whose r method returns an instance of Regex. The findFirstIn method finds the first occurrence of the pattern and returns it wrapped in an Option. To find all the occurrences, use the findAllIn method.
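Because findFirstIn returns an Option, the result is usually unpacked before use. The following is a minimal sketch of one way to do that; the object name FindFirstSketch, the helper describe, and the second pattern are illustrative assumptions, not part of the original example.

```scala
import scala.util.matching.Regex

object FindFirstSketch {
  val pattern: Regex = "Functional".r
  val text = "Scala is a Functional Programming Language"

  // findFirstIn yields Some(matchedText) on success and None otherwise.
  def describe(result: Option[String]): String = result match {
    case Some(found) => s"matched '$found'"
    case None        => "no match"
  }

  // findAllIn yields an iterator; toList collects every match in order.
  // Illustrative pattern: an uppercase letter followed by word characters.
  val capitalized: List[String] = """[A-Z]\w+""".r.findAllIn(text).toList

  def main(args: Array[String]): Unit = {
    println(describe(pattern.findFirstIn(text)))        // matched 'Functional'
    println(describe("Imperative".r.findFirstIn(text))) // no match
    println(capitalized) // List(Scala, Functional, Programming, Language)
  }
}
```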
The findAllIn method does not return a plain string; it returns an iterator over the matches. To get an actual string, we call mkString on the result.
The mkString method concatenates the matches, separated by the string passed to it. The pipe (|) symbol specifies an OR condition in the pattern, for example to accept both the small and the capital letter ‘S’ in the word ‘Scala’. Instead of the r method, the Regex constructor can be used.
Consider an example using the Regex constructor:
import scala.util.matching.Regex
object multipleoccurence {
  def main(args: Array[String]): Unit = {
    val p = new Regex("(S|s)tudent") // matches "Student" or "student"
    val st = "Student Id is unique. Students are interested in learning new things"
    println(p.findAllIn(st).mkString(",")) // join all matches with commas
  }
}
The main method above produces the following output:
Student,Student
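The same OR condition can be written in other ways. The sketch below compares the alternation with a character class and with the inline case-insensitive flag (?i) of the underlying Java regex engine; the sample text and the object name AlternationSketch are assumptions made for illustration.

```scala
import scala.util.matching.Regex

object AlternationSketch {
  // Hypothetical sample text containing both spellings.
  val text = "Student Id is unique. Most students enjoy learning new things"

  val withAlternation: Regex = "(S|s)tudent".r // explicit OR of two letters
  val withCharClass: Regex   = "[Ss]tudent".r  // character class, no capture group
  val ignoringCase: Regex    = "(?i)student".r // inline case-insensitive flag

  // Join every match of r in the sample text with commas.
  def found(r: Regex): String = r.findAllIn(text).mkString(",")

  def main(args: Array[String]): Unit = {
    println(found(withAlternation)) // Student,student
    println(found(withCharClass))   // Student,student
    println(found(ignoringCase))    // Student,student
  }
}
```

Note that the (?i) form is broader than the other two: it would also accept spellings such as "STUDENT".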
The replaceFirstIn method replaces the first occurrence of the matching text, and replaceAllIn replaces all the occurrences.
Consider the example below.
object Replace {
  def main(args: Array[String]): Unit = {
    val p = "Car".r
    val st = "Car has power windows"
    println(p.replaceFirstIn(st, "Alto")) // prints Alto has power windows
  }
}
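The example above only exercises replaceFirstIn. As a rough sketch of the difference, the code below applies both methods to a sentence in which the pattern occurs twice, and also shows the replaceAllIn overload that takes a function from each match to its replacement. The sample sentence and the object name ReplaceAllSketch are assumptions for illustration.

```scala
import scala.util.matching.Regex

object ReplaceAllSketch {
  val pattern: Regex = "Car".r
  // Hypothetical sample text with two occurrences of the pattern.
  val text = "Car has power windows. Car has airbags."

  // Only the first occurrence is replaced.
  val firstOnly: String = pattern.replaceFirstIn(text, "Alto")

  // Every occurrence is replaced.
  val everywhere: String = pattern.replaceAllIn(text, "Alto")

  // Function form: the replacement is computed from each Match.
  val shouted: String =
    pattern.replaceAllIn(text, (m: Regex.Match) => m.matched.toUpperCase)

  def main(args: Array[String]): Unit = {
    println(firstOnly)  // Alto has power windows. Car has airbags.
    println(everywhere) // Alto has power windows. Alto has airbags.
    println(shouted)    // CAR has power windows. CAR has airbags.
  }
}
```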
Subexpression    Matches
^                The start of a line.
$                The end of a line.
.                Any single character except a newline.
[…]              Any single character listed in the brackets.
[^…]             Any single character not listed in the brackets.
\\A              The start of the entire string.
\\z              The end of the entire string.
\\Z              The end of the entire string, ignoring a final newline if there is one.
re*              Zero or more occurrences of the preceding expression.
re+              One or more occurrences of the preceding expression.
re?              Zero or one occurrence of the preceding expression.
re{n}            Exactly n occurrences of the preceding expression.
re{n,}           n or more occurrences of the preceding expression.
re{n,m}          At least n and at most m occurrences of the preceding expression.
q|r              Either q or r.
(re)             Groups regular expressions and remembers the matched text.
(?:re)           Groups regular expressions without remembering the matched text.
(?>re)           An independent pattern, matched without backtracking.
\\w              A word character.
\\W              A non-word character.
\\s              A whitespace character, such as a space or one of [\t\n\r\f].
\\S              A non-whitespace character.
\\d              A digit, i.e. [0-9].
\\D              A non-digit.
\\G              The point where the previous match ended.
\\n              A back-reference to capture group number n.
\\b              A word boundary when outside brackets. Inside brackets it denotes a backspace in some regex flavors.
\\B              A non-word boundary.
\\n, \\t, etc.   Newlines, tabs, etc.
\\Q              Escapes (quotes) all characters up to \\E.
\\E              Ends the quoting started by \\Q.
The doubled backslashes are how these escapes are written inside an ordinary Scala string literal. Inside a triple-quoted string a single backslash is enough.
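A few of the constructs from the table can be tried out as follows. This is only a sketch: the sample line and the object name MetaSketch are assumptions, and the patterns are written in triple-quoted strings, so each escape uses a single backslash.

```scala
object MetaSketch {
  // Hypothetical sample input.
  val line = "Order 66 shipped on 2024-01-15, order 7 pending"

  // \d+ : one or more digits.
  val numbers: List[String] = """\d+""".r.findAllIn(line).toList

  // ^ and $ : anchors for the start and end of the line.
  val startsWithOrder: Boolean = "^Order".r.findFirstIn(line).isDefined
  val endsWithPending: Boolean = "pending$".r.findFirstIn(line).isDefined

  // re{n} : exactly n occurrences of the preceding expression.
  val date: Option[String] = """\d{4}-\d{2}-\d{2}""".r.findFirstIn(line)

  // \b : word boundary; the match is case-sensitive, so "Order" is skipped.
  val lowerOrder: List[String] = """\border\b""".r.findAllIn(line).toList

  // \Q...\E : quote the text in between, so + is a literal plus sign.
  val quoted: Option[String] = """\Q1+1\E""".r.findFirstIn("1+1=2")

  def main(args: Array[String]): Unit = {
    println(numbers)         // List(66, 2024, 01, 15, 7)
    println(startsWithOrder) // true
    println(endsWithPending) // true
    println(date)            // Some(2024-01-15)
    println(lowerOrder)      // List(order)
    println(quoted)          // Some(1+1)
  }
}
```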
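The REPL session below applies several of these constructs: splitting on whitespace, extracting groups with a Regex as a pattern, non-capturing groups, and back-references.
scala> val adder = "we're as similar as two dissimilar things in a pod.\n\t-blackadder"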
adder: String =
we're as similar as two dissimilar things in a pod.
-blackadder
scala> adder.split("\\s+")
res0: Array[String] = Array(we're, as, similar, as, two, dissimilar, things, in, a, pod., -blackadder)
scala> adder.split("""\s+""")
res1: Array[String] = Array(we're, as, similar, as, two, dissimilar, things, in, a, pod., -blackadder)
scala> val name = """(mr|mrs|ms)\. ([a-z][a-z]+) ([a-z][a-z]+)""".r
name: scala.util.matching.Regex = (mr|mrs|ms)\. ([a-z][a-z]+) ([a-z][a-z]+)
scala> val name(title, first, last) = "mr. james stevens"
title: String = mr
first: String = james
last: String = stevens
scala> val name(title, first, last) = "ms. sally kenton"
title: String = ms
first: String = sally
last: String = kenton
scala> val array(title, first, last) = "mr. james stevens".split(" ")
<console>:27: error: not found: value array
val array(title, first, last) = "mr. james stevens".split(" ")
^
scala> val phone1 = """\((\d{3})\)\s*(\d{3})-(\d{4})""".r
phone1: scala.util.matching.Regex = \((\d{3})\)\s*(\d{3})-(\d{4})
scala> val phone2 = """(\d{3})-(\d{3})-(\d{4})""".r
phone2: scala.util.matching.Regex = (\d{3})-(\d{3})-(\d{4})
scala> val phone1(area, first3, last4) = "(123) 555-5555"
area: String = 123
first3: String = 555
last4: String = 5555
scala> val phone2(area, first3, last4) = "123-555-5555"
area: String = 123
first3: String = 555
last4: String = 5555
scala> val namesharemthreegroups = """(m(?:r|rs|s))\. ([a-z][a-z]+) ([a-z][a-z]+)""".r
namesharemthreegroups: scala.util.matching.Regex = (m(?:r|rs|s))\. ([a-z][a-z]+) ([a-z][a-z]+)
scala> val namesharemthreegroups(title, first, last) = "mr. james stevens"
title: String = mr
first: String = james
last: String = stevens
scala> val rhymename = """(mr|mrs|ms)\. ([a-z])([a-z]+) ([a-z])\3""".r
rhymename: scala.util.matching.Regex = (mr|mrs|ms)\. ([a-z])([a-z]+) ([a-z])\3
scala> val rhymename(title, firstinitial, firstrest, lastinitial) = "mr. john bohn"
title: String = mr
firstinitial: String = j
firstrest: String = ohn
lastinitial: String = b
scala> val rhymename2 = """(mr|mrs|ms)\. ([a-z]([a-z]+)) ([a-z]\3)""".r
rhymename2: scala.util.matching.Regex = (mr|mrs|ms)\. ([a-z]([a-z]+)) ([a-z]\3)
scala> val rhymename2(title, first, _, last) = "mr. john bohn"
title: String = mr
first: String = john
last: String = bohn
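In the session above, a definition such as val phone1(area, first3, last4) = ... throws a scala.MatchError when the string does not match the pattern. A more defensive variant is to use the same Regex values as patterns in a match expression with a fallback case. The sketch below assumes a hypothetical helper named normalize and an object name ExtractSketch; the two phone patterns are the ones from the session.

```scala
import scala.util.matching.Regex

object ExtractSketch {
  val phone1: Regex = """\((\d{3})\)\s*(\d{3})-(\d{4})""".r
  val phone2: Regex = """(\d{3})-(\d{3})-(\d{4})""".r

  // A Regex used as a pattern must match the whole input string;
  // each capture group is bound to one variable.
  def normalize(input: String): Option[String] = input match {
    case phone1(area, first3, last4) => Some(s"$area-$first3-$last4")
    case phone2(area, first3, last4) => Some(s"$area-$first3-$last4")
    case _                           => None // neither format matched
  }

  def main(args: Array[String]): Unit = {
    println(normalize("(123) 555-5555")) // Some(123-555-5555)
    println(normalize("123-555-5555"))   // Some(123-555-5555)
    println(normalize("555-5555"))       // None
  }
}
```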